Exploiting Consistency Theory for Modeling Twitter Hashtag Adoption
Abstract
Twitter, a microblogging service, has evolved into a powerful communication platform with millions of active users who generate an immense volume of microposts every day. To facilitate categorization and search, users adopt hashtags: keywords or phrases preceded by the hash (#) character. Successfully predicting how information spreads in the form of trending topics or hashtags on Twitter could help identify new trends in real time and thus improve marketing efforts. Social theories such as consistency theory suggest that people prefer harmony or consistency in their thoughts. On Twitter, for example, users are likely to adopt the same trending hashtag multiple times before it eventually dies. In this paper, we propose a low-rank weighted matrix factorization approach to model trending hashtag adoption on Twitter based on consistency theory. In particular, we first cast the problem of modeling trending hashtag adoption as an optimization problem, then integrate consistency theory into it as a regularization term, and finally use the widely used matrix factorization technique to solve the optimization problem. Empirical experiments demonstrate that our method outperforms the baselines in predicting whether users will adopt a specific trending hashtag in the future.

Twitter (https://twitter.com), a well-known microblogging website, allows millions of active users to interact with each other and post tweets, messages of up to 140 characters, every day from computers or mobile devices. Twitter is popular for the massive spread of tweets and for its freedom of expression. Daily bursts of news, gossip, rumors, discussions and more are exchanged and shared by users all over the world, regardless of where they come from, their level of education, or their religion. Consequently, users on Twitter are easily overwhelmed by the tremendous volume of data.
To ease the task of categorizing content and following trends, Twitter allows users to freely assign hashtags to their tweets, i.e. strings prefixed by the hash "#" character. Hashtags help users categorize their own posts and thus represent a coarse-grained topic of the content. This mechanism is a community-driven convention for adding additional context to tweets. In particular, it helps tweet search and the quick propagation of a topic among millions of users by allowing them to join the discussion. Hashtags can be viewed as topical markers that indicate the core idea expressed in a tweet and hence are adopted by users who contribute similar content. Trending hashtags are hashtags that receive extensive attention in a very short period of time but eventually die at some point. Understanding the spread and propagation of such information through Twitter has many immediate applications, such as targeting users for marketing purposes, identifying trends and enhancing marketing efforts, or even following socio-political events and large natural disasters.

Consistency theory (Abelson 1983) is a social theory which suggests that people prefer harmony or consistency in their inner systems (beliefs, attitudes, thoughts, etc.). In other words, when things fall out of alignment, the discomfort of cognitive dissonance motivates people to change their thoughts and restore a workable level of consistency in their lives. An example of consistency theory on Twitter is that hashtags adopted by the same user are more likely to be consistent than two randomly chosen hashtags. Based on consistency theory, we expect that users who have adopted a certain trending hashtag in the past are more likely to adopt it again in the future.
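A hypothesis of this kind can be checked with a two-sample t-test. The sketch below is only an illustration under stated assumptions: it uses Welch's unequal-variance form of the test (the source does not say which variant is used), the function name `welch_t` is hypothetical, and the two samples are made-up per-pair consistency scores, not data from the paper.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test on samples a and b."""
    # Squared standard error of each sample mean.
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical toy scores (assumption): similarity of hashtag pairs adopted
# by the same user versus pairs of randomly chosen hashtags.
same_user = [0.9, 0.8, 0.7, 0.85, 0.95, 0.75]
random_pairs = [0.4, 0.5, 0.3, 0.45, 0.35, 0.55]

t, df = welch_t(same_user, random_pairs)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large positive t statistic would support the claim that same-user hashtags are more consistent than random ones. Turning t and df into a p-value needs the t distribution, which the standard library does not provide; in practice `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.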
In this study, we aim to model information spread in the form of trending hashtag adoption on Twitter using a matrix factorization scheme and consistency theory. Our main contributions are as follows:
• We perform a two-sample t-test to verify that users are more likely to adopt the same hashtag multiple times and hence have a consistent hashtag usage history.
• We formulate trending hashtag adoption prediction as an optimization problem and integrate consistency theory into it. To account for the fact that trending hashtags do not last long, we further incorporate an attenuation matrix into the optimization equation. We use a low-rank weighted matrix factorization model to solve the optimization equation and propose hCWMF. To speed up the optimization and quickly find suboptimal matrices, we use an alternating least squares scheme to update the corresponding matrices.
• We collect and build a dataset of tweets for 6 different trending hashtags to evaluate the proposed model and demonstrate its ability to predict hashtag adoption by users.

Table 1: Description of dataset
Number of trending hashtags: 6
Number of users: 212,062
Number of tweets: 425,731

This paper is organized as follows. In the first section, we explain our data crawling methodology. Next, we provide a formal definition of the problem at hand and the notation used throughout the paper, and detail our matrix factorization framework and its time complexity. We then conduct the experiments and discuss the results, and finally conclude the paper with a section on conclusions and future work.
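The combination of low-rank weighted matrix factorization and alternating least squares mentioned in the contributions can be sketched generically as follows. This is not the authors' hCWMF: the consistency regularization term is omitted because its exact form is not given here, and the exponential time-decay weights are only an assumed stand-in for the attenuation matrix. The names `solve`, `update`, `loss`, `weighted_als`, the decay rate 0.3, and the toy data are all hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve the small linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def update(R, W, F, lam):
    """For each row of R, solve a weighted ridge regression on fixed factors F."""
    k = len(F[0])
    out = []
    for r_row, w_row in zip(R, W):
        A = [[lam if a == b else 0.0 for b in range(k)] for a in range(k)]
        rhs = [0.0] * k
        for f, r, w in zip(F, r_row, w_row):
            for a in range(k):
                rhs[a] += w * r * f[a]
                for b in range(k):
                    A[a][b] += w * f[a] * f[b]
        out.append(solve(A, rhs))
    return out

def loss(R, W, U, V, lam):
    """Weighted squared error plus L2 penalties on both factor matrices."""
    err = sum(w * (r - sum(x * y for x, y in zip(u, v))) ** 2
              for u, r_row, w_row in zip(U, R, W)
              for v, r, w in zip(V, r_row, w_row))
    reg = sum(x * x for row in U + V for x in row)
    return err + lam * reg

def weighted_als(R, W, rank=2, lam=0.1, iters=30, seed=0):
    """Factor R into U V^T by alternating least squares under weights W."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in R]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in R[0]]
    Rt = [list(col) for col in zip(*R)]
    Wt = [list(col) for col in zip(*W)]
    for _ in range(iters):
        U = update(R, W, V, lam)    # fix V, solve for user factors
        V = update(Rt, Wt, U, lam)  # fix U, solve for hashtag factors
    return U, V

# Hypothetical toy data (assumption): rows are users, columns are hashtags,
# 1 means the user adopted the hashtag.
R = [[1, 0, 1], [1, 1, 0], [0, 1, 1], [1, 0, 0]]
# Assumed age in days of each observation; older entries get smaller weight.
age = [[0, 3, 1], [2, 0, 5], [4, 1, 0], [1, 2, 6]]
W = [[math.exp(-0.3 * a) for a in row] for row in age]

U, V = weighted_als(R, W)
print("final objective:", round(loss(R, W, U, V, 0.1), 4))
```

Each half-step solves its subproblem exactly, so the objective never increases from one iteration to the next. A predicted adoption score for user i and hashtag j is the dot product of row i of U and row j of V. A consistency regularizer would add a further term to both the objective and the normal equations in `update`.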
Similar resources
A New Perspective on Twitter Hashtag Use: Diffusion of Innovation Theory
Twitter is a fast growing real-time social media tool. As Twitter evolves, more and more people are partaking in sharing what is happening around the world through various Twitter applications. Hashtag use has become a unique tagging convention to help associate Twitter messages with certain events or contexts. Prefixed by a # symbol with a keyword, a Twitter hashtag serves as a bottom-up userp...
How Others Affect Your Twitter #hashtag Adoption? Examination of Community-based and Context-based Information Diffusion in Twitter
Twitter has become a rich source of people’s opinions about a variety of topics, such as their daily life, and current news. Twitter’s retweeting and mentioning mechanisms enable users to disseminate information broadly. In this study, we investigate the effects of community-based and context-based features on the users’ information adoption and diffusion patterns in Twitter. Community-based fe...
Twitter-Network Topic Model: A Full Bayesian Treatment for Social Network and Text Modeling
Twitter data is extremely noisy – each tweet is short, unstructured and with informal language, a challenge for current topic modeling. On the other hand, tweets are accompanied by extra information such as authorship, hashtags and the user-follower network. Exploiting this additional information, we propose the Twitter-Network (TN) topic model to jointly model the text and the social network i...
Characterizing Topic-Specific Hashtag Cascade in Twitter Based on Distributions of User Influence
As online social networks have become extremely popular, people communicate and exchange information for various purposes. In this paper, we investigate patterns of information diffusion and behaviors of participating users in Twitter, which would be useful to verify the effectiveness of marketing and publicity campaigns. We characterize Twitter hashtag cascades corresponding to differ...
Twitter Hash Tag Recommendation
The rise in popularity of microblogging services like Twitter has led to increased use of content annotation strategies like the hashtag. Hashtags provide users with a tagging mechanism to help organize, group, and create visibility for their posts. This is a simple idea but can be challenging for the user in practice which leads to infrequent usage. In this paper, we will investigate various m...
Journal:
- CoRR
Volume abs/1705.10455, Issue -
Pages -
Publication date 2017